Allow custom OptimizerHints by altmannmarcelo · Pull Request #2216 · apache/datafusion-sqlparser-rs

altmannmarcelo · 2026-02-12T23:22:36Z

Two orthogonal improvements to optimizer hint parsing:

1. `Option<OptimizerHint>` -> `Vec<OptimizerHint>`: the old Option silently dropped all but the first hint-style comment. Vec preserves all hint comments the parser encounters, letting consumers decide which to use. This is backwards compatible: `optimizer_hint: None` becomes `optimizer_hints: vec![]`, and `optimizer_hint.unwrap()` becomes `optimizer_hints[0]`.
2. Generic prefix extraction: the `/*+...*/` pattern is an established convention. Various systems extend it with `/*prefix+...*/` where the prefix is opaque alphanumeric text before `+`. Rather than adding a new dialect flag or struct for each system, the parser now captures any `[a-zA-Z0-9]*` run before `+` as a `prefix` field. Standard hints have `prefix: ""`. No new dialect surface -- same `supports_comment_optimizer_hint()` gate. This makes OptimizerHint a generic extension point: downstream consumers can define their own prefixed hint conventions and filter hints by prefix, without requiring any changes to the parser or dialect configuration.
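To make the shape of the change concrete, here is a minimal sketch using stand-in types. Only the field names `prefix`, `text`, and `optimizer_hints` come from this PR; the `Select` struct and the `first_hint` helper are hypothetical and are not part of sqlparser-rs.

```rust
/// Stand-in for the AST node discussed in this PR. Only the two fields
/// named in the description and tests are modeled here.
#[derive(Debug, Clone, PartialEq)]
pub struct OptimizerHint {
    /// Alphanumeric text before `+`; empty for a standard `/*+ ... */` hint.
    pub prefix: String,
    /// The hint text that follows `+`.
    pub text: String,
}

/// Hypothetical stand-in for a statement node carrying the new field.
#[derive(Debug, Default)]
pub struct Select {
    /// Previously `optimizer_hint: Option<OptimizerHint>`.
    pub optimizer_hints: Vec<OptimizerHint>,
}

/// Hypothetical helper: the first hint, if any. This is roughly what the
/// old `Option` field exposed via `as_ref()`.
pub fn first_hint(select: &Select) -> Option<&OptimizerHint> {
    select.optimizer_hints.first()
}
```

Indexing with `optimizer_hints[0]` panics on an empty vector, just as `unwrap()` panicked on `None`, so code that previously matched on the `Option` may prefer `.first()`.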

xitep

hello @altmannmarcelo,

i quite like your generalisation if it makes the feature useful for dialects supporting multiple hints! 👍 so i'm in for the change.

however, i wish you would introduce dialect flags to guide the parser to 1) allow the prefixes, and 2) allow accepting multiple hints (the AST can nicely present that with your suggestion of Vec<OptimizerHint>.)

yes, you wrote that downstream programs can do the validation themselves. and indeed they can. but, when you start writing your 3rd sql processor based on sqlparser, it becomes tiresome to repeat those validations in all of them (or to start maintaining a separate crate for these validations.) the great thing about sqlparser is that it has the concept of "dialects" and provides a common AST for all of them, yet is able to distinguish between the dialects' "idiosyncrasies." (having said that, i'm no authority and don't have a say in how sqlparser-rs wants to evolve.)

xitep · 2026-02-13T06:35:08Z

src/parser/mod.rs

+            Some((before_plus.to_string(), text.to_string()))
+        } else {
+            None
+        }


i think using str::split_once would make this shorter and leaner (possibly slightly more efficient), e.g. https://gist.github.com/rust-play/146d81960095525d6384f34d84ac7419
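For readers without access to the gist, the following is a rough sketch of what a `str::split_once`-based extraction could look like. It is not the code from the PR or from the linked gist. The function name `split_hint` is hypothetical, and it assumes the input is the comment body with the `/*` and `*/` delimiters already removed.

```rust
/// Hypothetical helper: given the body of a block comment (the text between
/// `/*` and `*/`), return `(prefix, hint_text)` if it looks like a hint.
pub fn split_hint(comment_body: &str) -> Option<(&str, &str)> {
    // `split_once` splits at the first `+` and yields `None` if there is none.
    let (prefix, text) = comment_body.split_once('+')?;
    // Accept only a (possibly empty) ASCII alphanumeric run before `+`,
    // mirroring the `[a-zA-Z0-9]*` rule from the PR description.
    if prefix.chars().all(|c| c.is_ascii_alphanumeric()) {
        Some((prefix, text))
    } else {
        None
    }
}
```

This sketch returns borrowed slices and leaves the hint text untrimmed. Whether the real parser trims whitespace or allocates `String`s at this point is not something this example claims to reproduce.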

xitep · 2026-02-13T06:36:35Z

tests/sqlparser_oracle.rs

+    assert_eq!(select.optimizer_hints.len(), 2);
+    assert_eq!(select.optimizer_hints[0].text, "one two three");
+    assert_eq!(select.optimizer_hints[0].prefix, "");
+    assert_eq!(select.optimizer_hints[1].text, "not a hint!");


well, this test was to assert that "not a hint!" is in fact "not an optimizer hint!" :)

altmannmarcelo · 2026-02-13T10:07:18Z

Thanks for the review @xitep, and glad you like the generalization!

I understand the appeal of dialect flags: they're great when the parser can definitively say "this syntax is valid/invalid for dialect X." However, I think optimizer hints are a different category. The `/*+ ... */` pattern is an established convention precisely because it's a standard way to extend functionality without requiring grammar changes. Different systems (and even different versions of the same system) recognize different hint keywords and prefixes, and the parser shouldn't need to know about all of them.

The current approach keeps the parser's responsibility focused: if the dialect supports optimizer hints (`supports_comment_optimizer_hint()`), collect them all and let the downstream consumer decide which ones are relevant. This is intentional: adding dialect flags for which prefixes are allowed or how many hints are accepted would couple the parser to specific hint conventions that are really consumer-level concerns.

For downstream projects, the filtering is straightforward: match on prefix and ignore what you don't recognize. This is simpler and more flexible than maintaining dialect flags that would need updating as hint conventions evolve. It also means new prefix conventions can be adopted by consumers without any changes to sqlparser-rs.
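As an illustration of the downstream filtering described above, here is a small sketch. The `OptimizerHint` struct is a local stand-in that models only the `prefix` and `text` fields mentioned in this PR, and `hints_with_prefix` is a hypothetical consumer-side helper, not an sqlparser-rs API.

```rust
/// Local stand-in for the AST node; only `prefix` and `text` are modeled.
#[derive(Debug, Clone, PartialEq)]
pub struct OptimizerHint {
    pub prefix: String,
    pub text: String,
}

/// Hypothetical consumer-side helper: keep the hints whose prefix matches
/// exactly and ignore everything else. Pass `""` for standard `/*+ ... */` hints.
pub fn hints_with_prefix<'a>(
    hints: &'a [OptimizerHint],
    prefix: &str,
) -> Vec<&'a OptimizerHint> {
    hints.iter().filter(|h| h.prefix == prefix).collect()
}
```

A consumer that only understands its own convention would call this with its prefix and skip unrecognized hints, which is the "ignore what you don't recognize" policy from the comment.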

xitep · 2026-02-13T11:23:11Z

i agree it will make the maintenance of sqlparser simpler by pushing the responsibilities downstream. and yes, in this case, with the optimizer hints, it is straightforward.

altmannmarcelo force-pushed the extend_optimizer_hints_upstream branch from 0a5df55 to 4e2c3ac on February 12, 2026 23:31

xitep reviewed Feb 13, 2026

altmannmarcelo wants to merge 1 commit into apache:main from altmannmarcelo:extend_optimizer_hints_upstream
